R for faculty, Fall 2024
Mathematics: A function maps each \(x\) in \(A\) to some unique \(y\) in \(B\).
Euclidean norm on \(\mathbb{R}^2\): \(\|x\| = \sqrt{x_1^2+x_2^2}\).
Then we call the function:
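A minimal sketch of defining and then calling such a function, using the two-dimensional Euclidean norm from above. The name norm2 is assumed here for illustration.

```r
# Euclidean norm on R^2 (norm2 is a hypothetical name)
norm2 <- function(x1, x2) {
  sqrt(x1^2 + x2^2)
}

# Call the function with x1 = 3 and x2 = 4
norm2(3, 4)  # 5
```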
Functions can also be written using a shorthand notation:
When a function fits on one line, you don’t need the curly braces.
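As a sketch, the same two-dimensional norm can be written in the shorthand form on a single line without braces. This assumes R 4.1.0 or later; the name norm2_short is hypothetical.

```r
# Shorthand (lambda) notation, one line, no curly braces needed
norm2_short <- \(x1, x2) sqrt(x1^2 + x2^2)

norm2_short(3, 4)  # 5
```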
\() vs function()
The \() notation is quite new, introduced in 2021 (R 4.1.0).
Most existing R code uses function().
\() is shorter; function is just too long.
Look at the norm function once more:
Here x1, x2, and x3 are the arguments of the function
You can compose multiple function calls. For example, you can compute the biased sample standard deviation (the square root of the biased sample variance) using sqrt() and mean():
The biased sample variance equals \(\frac{1}{n}\sum_{i=1}^n (x_i-\bar{x})^2\), where \(\bar{x}\) is the sample mean.
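A minimal sketch of composing sqrt() and mean() for this purpose. The function names biased_var and biased_sd and the data are made up for illustration.

```r
# Biased sample variance: the mean of squared deviations from the mean
biased_var <- \(x) mean((x - mean(x))^2)

# Its square root, obtained by composing sqrt() with the function above
biased_sd <- \(x) sqrt(biased_var(x))

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
biased_var(x)  # 4
biased_sd(x)   # 2
```

Note that the built-in var() and sd() divide by n - 1 instead of n, so they give slightly larger values.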
Let’s get comfy with functions through exercises.
Write a function summer that calculates the sum of the first \(n\) natural numbers, i.e., \(1, 2, \ldots, n\).
Write a function hn that calculates the \(n\)th harmonic number, i.e., \(H_n = 1/1 + 1/2 + \cdots + 1/n\). (Hint: What happens to 1/seq(n)?)
A famous approximation to the \(n\)th harmonic number states that \(H_n \approx \log(n)+\gamma\), where \(\gamma\) is the Euler–Mascheroni constant (-digamma(1) in R).
Write a function h_n_approx that returns the approximation to \(H_n\). Verify it on n=1000.
Write a function is_awesome(x) that returns "LOL" if its input is "MrBeast" and "Boring!" otherwise. Test the function on "MrBeast" and "William Shakespeare".
Functions are often used to “hide” information from the user, such as loops.
We may sum the values of a vector as follows:
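One way this might look, as a sketch: a for loop accumulates the total, and the function hides the loop from the caller. The name sum_loop is hypothetical.

```r
# Sum the values of a vector with an explicit loop
sum_loop <- function(x) {
  total <- 0
  for (value in x) {
    total <- total + value  # add each element to the running total
  }
  total
}

sum_loop(c(1, 2, 10, -1, 0))  # 12
```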
Using for loops, make a function maxi that returns the maximal value of a vector. Test it on x<-c(1,2,10,-1,0). You must use conditionals here.
Programming requires thinking.
We have the tools for solving the next exercise!
Suppose that x is a vector of numbers 1,2,...,n, but one number less than n is missing. Make a function get_missing that returns the missing number.
Suppose that x is a vector of numbers 1,2,...,n, but with one number missing. Make a function get_missing2 that returns the missing number.
R returns the last calculated value of a function, but you can also return explicitly.
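A small sketch showing both styles in one function. The name sign_label is made up for illustration.

```r
sign_label <- function(x) {
  if (x < 0) {
    return("negative")  # explicit return: leaves the function immediately
  }
  "non-negative"        # implicit return: the last evaluated value
}

sign_label(-1)  # "negative"
sign_label(2)   # "non-negative"
```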
The values defined in a function are private to that function – they belong to the function’s private scope.
The solution is 0, as the x inside f belongs to that function’s scope.
f returns 3, since a is an argument.
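The two answers above are consistent with examples along the following lines. These snippets are assumptions written for illustration, and the second function is named g only to avoid redefining f.

```r
# Example 1: assigning to x inside f does not touch the outer x
x <- 0
f <- function() {
  x <- 1  # a new x in f's private scope
  x
}
f()  # 1
x    # still 0

# Example 2: the argument a takes precedence over the outer a
a <- 1
g <- function(a) {
  a + 1   # uses the argument, not the outer a
}
g(2)  # 3
```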
R’s built-in functions.
Use the help (?fun in the terminal) and look things up on the internet.
In mathematics a function \(f:X\to Y\) is a “rule” that sends an object in \(X\) to an object in \(Y\).
Such functions exist in R and other programming languages too; they are called pure.
But functions can do other stuff as well. Such “other stuff” is called side effects.
A function can access the variables in its defining environment, but will not modify them.
Here f was able to access t=55 from its defining environment. But it can’t (easily!) modify the value of t in that environment.
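A sketch of this behaviour, assuming t is defined at the top level. The function g is a hypothetical addition showing that an ordinary assignment only creates a local copy.

```r
t <- 55

f <- function() {
  t + 1   # t is not defined inside f, so R finds it where f was defined
}
f()  # 56

g <- function() {
  t <- 0  # creates a local t; the outer t is untouched
  t
}
g()  # 0
t    # still 55
```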
<<-
You can modify variables outside a function from inside it. To do this, use <<-, which assigns to an existing variable in the function’s enclosing environments. This is very rarely useful!
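A minimal sketch of <<- in action. The names counter and increment are made up for illustration.

```r
counter <- 0

increment <- function() {
  counter <<- counter + 1  # modifies the counter defined outside the function
  invisible(counter)
}

increment()
increment()
counter  # 2
```

This is a side effect: calling increment() changes state outside the function, which is why such code is harder to reason about than a pure function.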
Take a look at the following function.
What is this?
Make a function square_root(a, x0, eps) that calculates the square root of a using Newton–Raphson. (Here x0 is the starting value and eps is the desired accuracy.) Give x0 a default value of 1 and eps a default value of 0.000001.
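Default values are given in the argument list with =. A sketch on an unrelated, hypothetical function power (not a solution to the exercise):

```r
# p defaults to 2 when the caller does not supply it
power <- function(x, p = 2) {
  x^p
}

power(3)         # 9, uses the default p = 2
power(3, p = 3)  # 27, overrides the default
```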
R objects.
Roxygen. Documentation comments start with #'.
Write a function mad2 that estimates the mean absolute deviation from the median. Write documentation for the function.
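A sketch of what Roxygen-style documentation can look like, shown on a hypothetical function range_width rather than on mad2. The tags @param, @return, and @examples are standard Roxygen tags.

```r
#' Width of the range of a numeric vector
#'
#' @param x A numeric vector.
#' @param na.rm Should missing values be removed? Defaults to FALSE.
#' @return A single number, the largest value minus the smallest value.
#' @examples
#' range_width(c(1, 5, 3))
range_width <- function(x, na.rm = FALSE) {
  max(x, na.rm = na.rm) - min(x, na.rm = na.rm)
}

range_width(c(1, 5, 3))  # 4
```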
... arguments (dot-dot-dot)
Look at the documentation ?sum. The argument ... tells R the function takes an arbitrary number of arguments, named or not.
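A short sketch of writing your own functions with ... arguments. The names mean_of and count_args are hypothetical.

```r
# Collect all arguments into a vector and average them
mean_of <- function(...) {
  values <- c(...)
  mean(values)
}
mean_of(1, 2, 3)  # 2

# list(...) keeps the arguments separate, including named ones
count_args <- function(...) {
  length(list(...))
}
count_args(1, "a", z = TRUE)  # 3
```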
Recall that the maximum likelihood estimator of a density family \(f(x;\theta)\) on the data \(x_1,\ldots,x_n\) is the \(\theta\) that maximizes \(\sum_{i=1}^n \log f(x_i;\theta)\).
The nlm function (non-linear minimization) does function minimization for you. It usually works well, but there are better, bespoke methods.
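A sketch of maximum likelihood with nlm, assuming an exponential model with unknown rate and a small made-up data set. Since nlm minimizes, we pass it the negative log-likelihood. Optimizing over the log of the rate is a choice made here to keep the rate positive; it is not required by nlm.

```r
# Made-up data, assumed to come from an exponential distribution
x <- c(0.2, 1.1, 0.5, 0.9, 0.3)

# Negative log-likelihood as a function of log(rate)
neg_loglik <- function(log_rate) {
  -sum(dexp(x, rate = exp(log_rate), log = TRUE))
}

# Minimize, starting from log(rate) = 0
fit <- nlm(neg_loglik, p = 0)

rate_hat <- exp(fit$estimate)
rate_hat       # numerical estimate of the rate
1 / mean(x)    # closed-form maximum likelihood estimate, for comparison
```

For the exponential distribution the closed-form answer makes nlm unnecessary, but the comparison is a convenient check that the numerical minimization behaves as expected.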
Functions are defined with \() {} or function() {}.
Functions are documented with Roxygen.
Comments and documentation